-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: state sync #1026
base: v1.5-dev
Are you sure you want to change the base?
feat!: state sync #1026
Conversation
WalkthroughThis pull request introduces a new Changes
Sequence Diagram(s)sequenceDiagram
participant C as Client (gRPC/Socket)
participant P as Proxy Client
participant R as Routed Client
participant S as Socket Client
participant A as ABCI Application
C->>P: FinalizeSnapshot(ctx, req)
P->>R: Delegate FinalizeSnapshot request
R->>S: Forward FinalizeSnapshot call
S->>A: Call FinalizeSnapshot (with ToRequestFinalizeSnapshot)
A-->>S: Return ResponseFinalizeSnapshot (+ error)
S-->>R: Pass back response
R-->>P: Relay response
P-->>C: Return ResponseFinalizeSnapshot
Poem
Tip 🌐 Web search-backed reviews and chat
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
abci/types/application.go (1)
102-104
: Consider adding documentation for the base implementation.While the empty implementation is appropriate for a base class, adding a comment explaining that this is a default no-op implementation would improve clarity.
+// FinalizeSnapshot provides a no-op implementation for the StateSyncer interface func (BaseApplication) FinalizeSnapshot(_ context.Context, _req *RequestFinalizeSnapshot) (*ResponseFinalizeSnapshot, error) { return &ResponseFinalizeSnapshot{}, nil }
proto/tendermint/abci/types.proto (1)
642-644
: Consider adding status field to ResponseFinalizeSnapshot.The empty response provides no feedback mechanism for the application to indicate success/failure of the finalization process.
Consider adding a status field:
message ResponseFinalizeSnapshot { + bool success = 1; // Indicates if snapshot finalization was successful + string error = 2; // Error message if finalization failed }abci/example/kvstore/kvstore.go (1)
583-598
: Implementation looks good, but error messages could be more descriptive.The FinalizeSnapshot implementation correctly verifies both height and app hash, with proper thread safety using mutex. However, the error messages could be more informative.
Consider enhancing the error messages with more details:
- return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch") + return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d", + app.LastCommittedState.GetHeight(), req.LightBlock.SignedHeader.Header.Height) - return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch") + return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x", + app.LastCommittedState.GetAppHash(), req.LightBlock.SignedHeader.Header.AppHash)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
abci/types/types.pb.go
is excluded by!**/*.pb.go
go.sum
is excluded by!**/*.sum
📒 Files selected for processing (30)
abci/client/grpc_client.go
(1 hunks)abci/client/mocks/client.go
(1 hunks)abci/client/routed_client.go
(1 hunks)abci/client/socket_client.go
(1 hunks)abci/example/kvstore/kvstore.go
(1 hunks)abci/types/application.go
(2 hunks)abci/types/messages.go
(1 hunks)abci/types/mocks/application.go
(1 hunks)config/config.go
(0 hunks)config/toml.go
(0 hunks)go.mod
(3 hunks)internal/proxy/client.go
(1 hunks)internal/statesync/chunks.go
(2 hunks)internal/statesync/mocks/stateprovider.go
(1 hunks)internal/statesync/reactor.go
(2 hunks)internal/statesync/reactor_test.go
(1 hunks)internal/statesync/stateprovider.go
(5 hunks)internal/statesync/syncer.go
(4 hunks)internal/statesync/syncer_test.go
(4 hunks)light/client.go
(1 hunks)light/client_test.go
(1 hunks)proto/tendermint/abci/types.proto
(5 hunks)spec/abci++/api.md
(7 hunks)test/e2e/networks/rotate.toml
(2 hunks)test/e2e/pkg/mockcoreserver/core_server.go
(1 hunks)test/e2e/pkg/mockcoreserver/methods.go
(1 hunks)test/e2e/runner/setup.go
(0 hunks)test/e2e/runner/start.go
(1 hunks)types/quorum_sign_data.go
(1 hunks)version/version.go
(1 hunks)
💤 Files with no reviewable changes (3)
- test/e2e/runner/setup.go
- config/toml.go
- config/config.go
✅ Files skipped from review due to trivial changes (2)
- light/client_test.go
- light/client.go
🧰 Additional context used
🪛 Markdownlint (0.37.0)
spec/abci++/api.md
431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
(MD022, blanks-around-headings)
439-439: null
Link fragments should be valid
(MD051, link-fragments)
945-945: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
(MD022, blanks-around-headings)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: tests (02)
- GitHub Check: e2e-test (rotate)
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (31)
version/version.go (1)
14-14
: ABCI version bump aligns with new functionality.The version update from 1.2.0 to 1.3.0 correctly follows semantic versioning for introducing the new
FinalizeSnapshot
method.Let's verify this version is consistently used across the codebase:
✅ Verification successful
ABCI version update is consistently implemented.
The version bump from 1.2.0 to 1.3.0 is properly defined and consistently used across the codebase through the centralized
version.ABCIVersion
constant. No conflicting version strings were found.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any hardcoded ABCI version strings rg -g '!version/version.go' '"1\.[0-9]\.[0-9]"'Length of output: 48492
Script:
#!/bin/bash # Search for ABCI version references more specifically rg -i "abci.*version|version.*abci" # Also check for any proto definitions that might reference ABCI versions fd -e proto | xargs rg -i "version|abci"Length of output: 37694
test/e2e/runner/start.go (1)
91-91
: Simplified error handling looks good.The removal of intermediate block variables and direct error checking from
waitForHeight
aligns with the removal of trust parameters from state sync configuration.test/e2e/pkg/mockcoreserver/methods.go (1)
68-68
: Correctly includes QuorumHash in unmarshal parameters.The addition of
&cmd.QuorumHash
ensures proper unmarshalling of the quorum verification request.abci/types/messages.go (1)
90-94
: New ToRequestFinalizeSnapshot function follows established patterns.The implementation correctly converts RequestFinalizeSnapshot to the generic Request type, maintaining consistency with other message conversion functions in the file.
abci/types/application.go (1)
19-20
: LGTM! Well-documented interface addition.The new
FinalizeSnapshot
method in theStateSyncer
interface is clearly documented and aligns with the state sync enhancement objectives.test/e2e/pkg/mockcoreserver/core_server.go (1)
149-150
: LGTM! Good simplification of return value.The change simplifies the code by directly returning the boolean value instead of wrapping it in a struct, while maintaining the same functionality.
internal/statesync/mocks/stateprovider.go (1)
148-205
: LGTM! Correctly generated mock implementation.The new
LightBlock
mock method follows the established patterns for error handling and return values, properly supporting the state sync enhancement.types/quorum_sign_data.go (1)
164-164
: Verify the impact of changing the hash size constant.The validation now uses
crypto.QuorumHashSize
instead ofcrypto.DefaultHashSize
. Please ensure:
- The new constant exists and has the correct value
- All existing quorum hashes comply with the new size requirement
✅ Verification successful
QuorumHashSize constant is correctly defined and used consistently
The validation now uses the semantically correct
crypto.QuorumHashSize
constant which equalscrypto.DefaultHashSize
(32 bytes). The codebase consistently validates quorum hashes against this size through multiple validation checks, and test files demonstrate correct usage of the constant.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence and usage of QuorumHashSize # Check if QuorumHashSize is defined rg "QuorumHashSize.*=.*\d+" -A 2 # Find all quorum hash declarations to verify size compatibility ast-grep --pattern 'QuorumHash []byte'Length of output: 75
Script:
#!/bin/bash # Broader search for hash size constants and usages # Find QuorumHashSize constant definition and usage rg "QuorumHashSize" -B 2 -A 2 # Find DefaultHashSize for comparison rg "DefaultHashSize" -B 2 -A 2 # Find quorum hash related code with more context rg "QuorumHash" -B 2 -A 2 # Look specifically in crypto package fd -t f . "crypto" --exec cat {}Length of output: 80387
internal/proxy/client.go (1)
215-218
: LGTM! Implementation follows established patterns.The
FinalizeSnapshot
method is well-implemented, following the same patterns as other methods in the file:
- Consistent error handling
- Proper timing metrics
- Correct delegation to the underlying client
abci/client/grpc_client.go (1)
258-260
: LGTM! Implementation follows established patterns.The
FinalizeSnapshot
method is well-implemented, following the same patterns as other methods in the file:
- Proper request conversion using
ToRequestFinalizeSnapshot
- Consistent use of
grpc.WaitForReady(true)
option- Correct error handling
internal/statesync/chunks.go (2)
136-144
: LGTM! Improved handling of empty chunks.The changes improve the robustness of chunk handling:
- Maintains essential nil check for the chunk itself
- Safely handles empty chunk content by replacing nil with empty byte slice
- Prevents potential nil pointer dereference while allowing valid empty chunks
164-164
: LGTM! Consistent use of modified data.Correctly uses the potentially modified
data
variable for writing, ensuring consistency with the empty chunk handling changes.abci/client/routed_client.go (1)
354-357
: LGTM! Implementation follows established patterns.The
FinalizeSnapshot
method is well-implemented, following the same patterns as other methods in the file:
- Consistent use of delegate pattern
- Proper type assertion for response
- Correct error handling
internal/statesync/stateprovider.go (3)
43-44
: LGTM: Well-structured interface additionThe new
LightBlock
method follows the existing interface pattern and is properly documented.
126-131
: LGTM: Thread-safe implementation of LightBlockThe implementation correctly reuses the existing
verifyLightBlockAtHeight
method and maintains thread safety through proper mutex usage.
271-276
: LGTM: Consistent implementation across providersThe P2P implementation maintains consistency with the RPC implementation, ensuring proper thread safety and error handling.
internal/statesync/syncer.go (2)
353-354
: LGTM: Enhanced sync process with light blocksThe change improves the sync process by retrieving complete light blocks instead of just commits, providing more comprehensive block data for verification.
581-599
: LGTM: Robust snapshot finalization implementationThe implementation properly handles snapshot finalization by sending light blocks to the ABCI app and includes appropriate logging.
abci/types/mocks/application.go (1)
Line range hint
261-422
: LGTM! The new FinalizeSnapshot mock implementation follows established patterns.The implementation correctly handles:
- Error cases with appropriate panic messages
- Type assertions and conversions
- Method signatures matching the interface
- Supporting mock types for testing
internal/statesync/reactor_test.go (1)
239-240
: LGTM! Mock expectation for FinalizeSnapshot is correctly configured.The mock is set up to be called exactly once and return an empty response, which is appropriate for this test scenario.
internal/statesync/reactor.go (1)
239-240
: LGTM! Trust-related configuration has been cleanly removed.The changes consistently remove trust-related parameters from both P2P and RPC state provider initialization while maintaining proper logging.
Also applies to: 252-253, 261-261
abci/client/mocks/client.go (1)
365-422
: LGTM! The new FinalizeSnapshot mock implementation follows established patterns.The implementation correctly handles:
- Error cases with appropriate panic messages
- Type assertions and conversions
- Method signatures matching the interface
- Supporting mock types for testing
test/e2e/networks/rotate.toml (1)
155-155
: LGTM! State sync configuration changes look good.The changes consistently configure p2p-based state sync for both validator06 and full01 nodes, aligning with the removal of trust-based parameters.
Also applies to: 195-195
abci/client/socket_client.go (1)
360-366
: LGTM! Clean implementation of FinalizeSnapshot method.The implementation follows the established pattern for ABCI methods in the socket client, with proper error handling and request/response flow.
go.mod (1)
12-13
: Verify compatibility of updated dependencies.Multiple dependency updates have been made. Please ensure:
- The new version of dashd-go (v0.26.1-0.20250113154619-015ef93ad36e) is stable and compatible
- The btcec/v2 v2.2.0 update doesn't introduce breaking changes
- The upgrade to stable versions (btclog v1.0.0) is intentional
Also applies to: 83-83, 111-111
✅ Verification successful
Dependencies update appears safe to proceed
The verification shows these are indirect dependencies with no direct usage in the codebase. The version updates follow semantic versioning principles, suggesting backward compatibility. While breaking changes exist in dashd-go, they don't appear to affect this codebase as there's no direct usage of the updated components.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check for any known issues or breaking changes in the updated dependencies # Check dashd-go changelog/releases gh api repos/dashpay/dashd-go/releases | jq -r '.[].body' | grep -i "breaking change\|security" # Check btcec changelog gh api repos/dashpay/dashd-go/contents/btcec/v2/CHANGELOG.md | jq -r '.content' | base64 -d | grep -A 5 "## v2.2.0"Length of output: 356
Script:
#!/bin/bash # Search for import patterns and potential breaking changes echo "=== Checking import patterns ===" rg -l "github.com/dashpay/dashd-go/btcec/v2" --type go echo -e "\n=== Checking documentation ===" fd -e md -e txt . | xargs rg -l "dashd-go|btcec" || true echo -e "\n=== Checking recent changes ===" git log -n 5 --oneline -- go.mod echo -e "\n=== Checking current usages ===" rg "btcec\..*" --type go -A 2Length of output: 454
Script:
#!/bin/bash # Check for indirect dependencies and upgrade guides echo "=== Checking go.mod for indirect dependencies ===" cat go.mod | grep -A 10 "require" echo -e "\n=== Looking for upgrade documentation ===" fd -e md . | xargs rg -l "upgrade|migration|breaking|changelog" || true echo -e "\n=== Checking dashd-go main repository ===" gh api repos/dashpay/dashd-go/contents/docs | jq -r '.[].name' || true gh api repos/dashpay/dashd-go/contents/UPGRADING.md | jq -r '.content' | base64 -d || trueLength of output: 4572
proto/tendermint/abci/types.proto (2)
187-193
: LGTM! Well-defined RequestFinalizeSnapshot message.The message includes the essential light_block field needed for state reconstruction after snapshot sync.
908-908
: LGTM! Service definition properly updated.The FinalizeSnapshot RPC method is correctly added to the ABCIApplication service.
internal/statesync/syncer_test.go (4)
103-106
: LGTM: Commit structure properly includes Height field.The commit structure now correctly includes the Height field, which is essential for proper block validation.
108-115
: LGTM: Well-structured LightBlock implementation.The LightBlock structure properly encapsulates both the Commit and Header, following the standard blockchain block structure pattern.
131-132
: LGTM: StateProvider mock updated to use LightBlock.The mock has been correctly updated to return a LightBlock instead of just the commit, aligning with the new state sync protocol.
268-270
: LGTM: FinalizeSnapshot mock properly configured.The FinalizeSnapshot mock is correctly set up with proper expectations and return values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
proto/tendermint/abci/types.proto (2)
186-196
: Fix documentation formatting.The documentation has some minor formatting issues:
- Missing period after "e.g"
- Missing blank line before heading
// RequestFinalizeSnapshot is called by Tenderdash after successfully applying all snapshot chunks, eg. +// RequestFinalizeSnapshot is called by Tenderdash after successfully applying all snapshot chunks, e.g., // when the ABCI application returned `ResponseApplySnapshotChunk` with `Result = ACCEPT`. // It includes the light block committed at the synced height, which Tenderdash uses to reconstruct its own state. // The application should validate the light block against its restored state. + // If the application fails to validate the light block, it should return error in the response. // This is considered fatal, non-recoverable consensus failure and will cause Tenderdash to restart.
Line range hint
947-956
: Fix documentation formatting.Add a blank line before the heading.
<a name="tendermint-abci-ResponseFinalizeSnapshot"></a> + ### ResponseFinalizeSnapshot The response to a `RequestPrepareSnapshot` message.
abci/example/kvstore/kvstore.go (2)
583-600
: Consider enhancing the logging for better observability.The debug logging could be more detailed to help with troubleshooting. Consider adding logging for the validation steps and error cases.
func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) { app.mu.Lock() defer app.mu.Unlock() + app.logger.Debug("Starting snapshot finalization", + "expected_height", app.LastCommittedState.GetHeight(), + "expected_app_hash", app.LastCommittedState.GetAppHash(), + "snapshot_height", req.LightBlock.SignedHeader.Header.Height, + "snapshot_app_hash", req.LightBlock.SignedHeader.Header.AppHash) + // we verify snapshot height and app hash, as there is no additional logic to be called here if app.LastCommittedState.GetHeight() != req.LightBlock.SignedHeader.Header.Height { + app.logger.Error("Snapshot height mismatch", + "expected", app.LastCommittedState.GetHeight(), + "got", req.LightBlock.SignedHeader.Header.Height) return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d", app.LastCommittedState.GetHeight(), req.LightBlock.SignedHeader.Header.Height) } if !app.LastCommittedState.GetAppHash().Equal(req.LightBlock.SignedHeader.Header.AppHash) { + app.logger.Error("Snapshot app hash mismatch", + "expected", app.LastCommittedState.GetAppHash(), + "got", req.LightBlock.SignedHeader.Header.AppHash) return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x", app.LastCommittedState.GetAppHash(), req.LightBlock.SignedHeader.Header.AppHash) } app.logger.Debug("FinalizeSnapshot finished successfully", "req", req) return &abci.ResponseFinalizeSnapshot{}, nil }
583-600
: Well-integrated state sync implementation.The
FinalizeSnapshot
method is well-integrated into the state sync workflow, following the same patterns as other ABCI methods (mutex locking, error handling, logging). It complements the existing snapshot-related methods (OfferSnapshot
,ApplySnapshotChunk
, etc.) to provide a complete state synchronization solution.The implementation demonstrates good practices:
- Consistent error handling across all snapshot-related methods
- Thread-safety through mutex usage
- Clear separation of concerns between different snapshot operations
- Proper validation of snapshot metadata before finalization
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
abci/types/types.pb.go
is excluded by!**/*.pb.go
📒 Files selected for processing (4)
abci/example/kvstore/kvstore.go
(1 hunks)abci/types/application.go
(2 hunks)proto/tendermint/abci/types.proto
(5 hunks)spec/abci++/api.md
(7 hunks)
🧰 Additional context used
🪛 LanguageTool
spec/abci++/api.md
[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...
(ENGLISH_WORD_REPEAT_RULE)
[uncategorized] ~432-~432: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...cessfully applying all snapshot chunks, eg. when the ABCI application returned `Res...
(E_G)
🪛 Markdownlint (0.37.0)
spec/abci++/api.md
431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
(MD022, blanks-around-headings)
443-443: null
Link fragments should be valid
(MD051, link-fragments)
949-949: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
(MD022, blanks-around-headings)
🔇 Additional comments (5)
abci/types/application.go (2)
19-20
: LGTM! Interface addition is well-documented.The FinalizeSnapshot method is properly added to the StateSyncer interface with clear documentation.
102-105
: LGTM! Base implementation provides good default behavior.The no-op implementation in BaseApplication follows the pattern used for other methods.
proto/tendermint/abci/types.proto (1)
Line range hint
252-252
: LGTM! Protocol changes are well-structured.The new finalize_snapshot field is properly added to Request/Response messages and the ABCIApplication service.
Also applies to: 836-836, 1402-1402
spec/abci++/api.md (1)
431-444
: Documentation needs enhancement.The FinalizeSnapshot documentation should be expanded to include:
- When exactly Tenderdash calls this method in the state sync process
- What the application should do with the light block
- Error handling expectations
- Impact on the node's state
Consider adding a detailed section similar to other ABCI methods:
### RequestFinalizeSnapshot #### Usage - Called by Tenderdash after successfully applying all snapshot chunks - The light block allows the application to verify the final state - Applications should validate the light block against their restored state - Any errors during finalization should trigger a state sync retry #### When does Tenderdash call it? Detailed sequence of events...🧰 Tools
🪛 LanguageTool
[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...(ENGLISH_WORD_REPEAT_RULE)
[uncategorized] ~432-~432: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...cessfully applying all snapshot chunks, eg. when the ABCI application returned `Res...(E_G)
🪛 Markdownlint (0.37.0)
431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines(MD022, blanks-around-headings)
443-443: null
Link fragments should be valid(MD051, link-fragments)
abci/example/kvstore/kvstore.go (1)
583-600
: LGTM! The implementation looks solid.The
FinalizeSnapshot
method correctly validates both the height and AppHash of the snapshot against the last committed state, with proper error handling and thread safety.A few notable strengths of this implementation:
- Thread-safe with mutex locking
- Clear, descriptive error messages
- Proper debug logging
- Consistent with ABCI protocol requirements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
proto/tendermint/abci/types.proto (1)
186-196
: Documentation needs enhancement.The documentation should be expanded to include:
- When exactly Tenderdash calls this method in the state sync process
- What the application should do with the light block
- Error handling expectations
- Impact on the node's state
Consider adding a detailed section similar to other ABCI methods:
### RequestFinalizeSnapshot #### Usage - Called by Tenderdash after successfully applying all snapshot chunks - The light block allows the application to verify the final state - Applications should validate the light block against their restored state - Any errors during finalization should trigger a state sync retry #### When does Tenderdash call it? Detailed sequence of events...
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
abci/types/types.pb.go
is excluded by!**/*.pb.go
📒 Files selected for processing (2)
proto/tendermint/abci/types.proto
(5 hunks)spec/abci++/api.md
(7 hunks)
🧰 Additional context used
🪛 Markdownlint (0.37.0)
spec/abci++/api.md
431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
(MD022, blanks-around-headings)
443-443: null
Link fragments should be valid
(MD051, link-fragments)
949-949: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines
(MD022, blanks-around-headings)
⏰ Context from checks skipped due to timeout of 90000ms (12)
- GitHub Check: test_apps
- GitHub Check: test_abci_cli
- GitHub Check: tests (05)
- GitHub Check: tests (04)
- GitHub Check: tests (03)
- GitHub Check: tests (02)
- GitHub Check: e2e-test (rotate)
- GitHub Check: tests (01)
- GitHub Check: Super linter
- GitHub Check: tests (00)
- GitHub Check: e2e-test (dashcore)
- GitHub Check: golangci-lint
🔇 Additional comments (5)
proto/tendermint/abci/types.proto (3)
35-35
: LGTM! Field addition follows protobuf best practices.The finalize_snapshot field is properly added to the Request message with an appropriate field number.
529-529
: LGTM! Field addition follows protobuf best practices.The finalize_snapshot field is properly added to the Response message with an appropriate field number.
Line range hint
1402-1402
: LGTM! Service definition is properly updated.The FinalizeSnapshot RPC is correctly added to the ABCIApplication service.
spec/abci++/api.md (2)
429-449
: Documentation needs enhancement.The documentation feedback from the proto file review applies here as well.
🧰 Tools
🪛 Markdownlint (0.37.0)
431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines(MD022, blanks-around-headings)
443-443: null
Link fragments should be valid(MD051, link-fragments)
21-21
: LGTM! Table of contents is properly updated.The table of contents correctly reflects the new FinalizeSnapshot messages.
Also applies to: 39-39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
internal/statesync/syncer.go (4)
8-8
: Remove debug-only import.The
os
package is only used for debug operations. Consider removing it along with the associated debug code.
174-176
: Validate retry counter logic.The retry counter check should happen before attempting to get the best snapshot to avoid unnecessary processing.
- if retries > 0 && snapshot == nil && iters > retries { + if retries > 0 && iters > retries { return sm.State{}, nil, errNoSnapshots }
375-377
: Improve error message for invalid initial height.The error message should be more descriptive about the expected valid range.
- return sm.State{}, nil, fmt.Errorf("initial genesis height %d is invalid", state.InitialHeight) + return sm.State{}, nil, fmt.Errorf("initial genesis height must be greater than 0, got %d", state.InitialHeight)
608-635
: Enhance error handling in finalizeSnapshot.The method has good validation but could be improved:
- Add validation for snapshot parameter
- Enhance error wrapping for better debugging
func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error { + if snapshot == nil { + return fmt.Errorf("nil snapshot provided") + } s.logger.Info("Finalizing snapshot restoration", "snapshot", snapshot.Hash.String(), "height", snapshot.Height, @@ -629,7 +632,7 @@ _, err = s.conn.FinalizeSnapshot(ctx, &abci.RequestFinalizeSnapshot{ SnapshotBlock: snapshotBlockProto, GenesisBlock: genesisBlockProto, - }) + }); if err != nil { return fmt.Errorf("ABCI FinalizeSnapshot call failed: %w", err) } return err }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/statesync/syncer.go
(12 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test_apps
- GitHub Check: test_abci_cli
- GitHub Check: build
- GitHub Check: tests (02)
- GitHub Check: tests (01)
- GitHub Check: e2e-test (rotate)
- GitHub Check: golangci-lint
- GitHub Check: Super linter
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (1)
internal/statesync/syncer.go (1)
379-383
: LGTM! Proper use of MustConvertUint64.Based on the retrieved learning from @lklimek, using
MustConvertUint64
is appropriate here since the height is already validated to be positive and cannot exceed uint64 range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/statesync/chunks.go (1)
302-315
: Add thread-safety documentation.While the implementation is correct, consider adding documentation about thread-safety guarantees for better API clarity.
-// Pending returns a list of all chunks that have been requested but not yet received. +// Pending returns a list of all chunks that have been requested but not yet received. +// This method is thread-safe as it acquires the queue's mutex lock.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/statesync/chunks.go
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: test_abci_cli
- GitHub Check: test_apps
- GitHub Check: build
- GitHub Check: tests (03)
- GitHub Check: e2e-test (rotate)
- GitHub Check: tests (01)
- GitHub Check: e2e-test (dashcore)
- GitHub Check: Super linter
🔇 Additional comments (3)
internal/statesync/chunks.go (3)
107-110
: LGTM! Enhanced retry mechanism.The change improves the retry mechanism by resetting chunk status to
initStatus
when re-enqueueing existing chunks, allowing for proper re-fetching.
138-146
: LGTM! Improved nil safety handling.The change properly handles nil chunk content by replacing it with an empty byte slice, maintaining nil safety while allowing empty chunks.
296-299
: LGTM! Enhanced timeout error reporting.The change improves error reporting by including the count of pending chunks in timeout errors, making it easier to diagnose synchronization issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/p2p/router.go (1)
360-362
: Consider making the queue timeout configurable.The 1-minute timeout for queue operations is hardcoded. Consider making this configurable through RouterOptions to allow flexibility based on different network conditions or requirements.
type RouterOptions struct { // ... existing fields ... + // QueueTimeout is the timeout for queue operations. Defaults to 1 minute. + QueueTimeout time.Duration } func (o *RouterOptions) Validate() error { // ... existing validation ... + if o.QueueTimeout == 0 { + o.QueueTimeout = time.Minute + } return nil }Then update the timeout creation:
-qCtx, qcancel := context.WithTimeout(ctx, time.Minute) +qCtx, qcancel := context.WithTimeout(ctx, r.options.QueueTimeout)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/p2p/router.go
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: test_apps
- GitHub Check: e2e-test (rotate)
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (2)
internal/p2p/router.go (2)
363-370
: LGTM! Enhanced logging for chunk messages.The addition of debug logs for chunk messages will help with troubleshooting state sync issues, which aligns well with the PR's objectives.
376-378
: LGTM! Proper timeout handling.The timeout case in the select statement ensures graceful handling of queue timeouts with appropriate debug logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
internal/statesync/chunks.go (2)
148-156
: Simplify mutex management.The manual mutex management with
unlockFn
is complex and potentially error-prone. Consider usingdefer q.mtx.Unlock()
and restructuring the code to avoid the need for manual unlocking before channel operations.- // we need to manage lock manually to securely unlock before sending to applyCh - locked := true - unlockFn := func() { - if locked { - q.mtx.Unlock() - locked = false - } - } - defer unlockFn() + defer q.mtx.Unlock()
317-329
: Optimize memory allocation in Pending method.The method pre-allocates a slice with capacity equal to the total number of items, but it might only need a fraction of that size. Consider using a smaller initial capacity or a more memory-efficient approach.
- waiting := make([]bytes.HexBytes, 0, len(q.items)) + // Start with a smaller capacity as we expect most chunks to be received + waiting := make([]bytes.HexBytes, 0, len(q.items)/4)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/statesync/chunks.go
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: test_abci_cli
- GitHub Check: test_apps
- GitHub Check: build
- GitHub Check: e2e-test (rotate)
- GitHub Check: Super linter
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (2)
internal/statesync/chunks.go (2)
18-23
: LGTM! Well-defined error variables.The error variables are well-defined with clear, specific messages that help in error handling and debugging.
388-401
: LGTM! Well-implemented helper method.The
getItem
helper method is well-documented, has proper error handling, and maintains the mutex contract clearly in its documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
internal/libs/sync/mutexguard.go (4)
1-5
: Consider renaming the package to avoid conflict with standard library.The package name
sync
might conflict with Go's standard library package of the same name. Consider using a more specific name likesyncutil
ormutexutil
to avoid potential confusion.-package sync +package syncutil
7-20
: LGTM! Consider more descriptive interface names.The interfaces are well-defined and documented. They follow the interface segregation principle and match the standard library's mutex behaviors. Consider using more descriptive names like
Locker
andRLocker
to better reflect their purpose, matching the standard library'ssync.Locker
interface.-type Mtx interface { +type Locker interface { Lock() Unlock() } -type RMtx interface { +type RLocker interface { RLock() RUnlock() }
22-37
: Add nil check for defensive programming.The implementation is thread-safe and effectively prevents double-unlock issues. However, it would be more robust with a nil check.
func LockGuard(mtx Mtx) UnlockFn { + if mtx == nil { + panic("mutex is nil") + } mtx.Lock() locked := true
39-54
: Consider code reuse and add nil check.The implementation is correct but duplicates the pattern from
LockGuard
. Consider extracting the common pattern and adding a nil check.Here's a suggested refactor to reduce duplication:
+// guardFn is a helper type for lock/unlock function pairs +type guardFn struct { + lock func() + unlock func() +} + +func newGuard(fn guardFn) UnlockFn { + if fn.lock == nil || fn.unlock == nil { + panic("invalid guard functions") + } + fn.lock() + locked := true + return func() bool { + if locked { + locked = false + fn.unlock() + return true + } + return false + } +} + func LockGuard(mtx Mtx) UnlockFn { + if mtx == nil { + panic("mutex is nil") + } - mtx.Lock() - locked := true - - return func() bool { - if locked { - locked = false - mtx.Unlock() - return true - } - return false - } + return newGuard(guardFn{ + lock: mtx.Lock, + unlock: mtx.Unlock, + }) } func RLockGuard(mtx RMtx) UnlockFn { + if mtx == nil { + panic("mutex is nil") + } - mtx.RLock() - locked := true - - return func() bool { - if locked { - locked = false - mtx.RUnlock() - return true - } - return false - } + return newGuard(guardFn{ + lock: mtx.RLock, + unlock: mtx.RUnlock, + }) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.mockery.yaml
(1 hunks)internal/consensus/state.go
(1 hunks)internal/libs/sync/mutexguard.go
(1 hunks)internal/statesync/chunks.go
(4 hunks)internal/statesync/chunks_test.go
(4 hunks)internal/statesync/mocks/consensusstateprovider.go
(1 hunks)internal/statesync/reactor.go
(19 hunks)internal/statesync/reactor_test.go
(16 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/statesync/chunks_test.go
- internal/statesync/chunks.go
- internal/statesync/reactor.go
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: test_apps
- GitHub Check: e2e-test (rotate)
- GitHub Check: tests (00)
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (6)
internal/statesync/mocks/consensusstateprovider.go (1)
1-128
: LGTM! The mock implementation is well-structured.The auto-generated mock follows best practices:
- Proper error handling for unspecified return values
- Support for both direct value returns and function callbacks
- Clean initialization with automatic expectation verification
internal/consensus/state.go (1)
310-312
: LGTM! The GetCurrentHeight implementation is clean and thread-safe.The method correctly retrieves the current height from the thread-safe stateDataStore.
internal/statesync/reactor_test.go (3)
85-92
: LGTM! The setup function signature update is appropriate.The addition of csState parameter allows for proper testing of the ConsensusStateProvider functionality.
241-241
: LGTM! The FinalizeSnapshot mock expectation is well-defined.The mock expectation correctly verifies that FinalizeSnapshot is called exactly once with the right parameters.
772-786
: LGTM! The retryUntil helper function improves test reliability.The function properly handles retries with timeout and context cancellation.
.mockery.yaml (1)
56-56
: LGTM! The mockery configuration update is correct.ConsensusStateProvider is properly added to the statesync package interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
internal/statesync/syncer.go (2)
175-178
: Consider improving retry condition.The retry condition could be more clearly expressed.
-if retries > 0 && snapshot == nil && iters > retries { +if retries > 0 && snapshot == nil && iters >= retries {
551-553
: Consider improving error handling.The error handling could be more robust here.
-s.logger.Error("failed to request snapshot chunk", "err", err, "chunkID", ID) -// retry the chunk -s.chunkQueue.Enqueue(ID) +s.logger.Error("failed to request snapshot chunk, will retry", + "err", err, + "chunkID", ID, + "retry_timeout", s.retryTimeout) +s.chunkQueue.Enqueue(ID) + +// Allow some time before retry to avoid overwhelming the network +time.Sleep(100 * time.Millisecond)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/statesync/syncer.go
(12 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: test_abci_cli
- GitHub Check: test_apps
- GitHub Check: build
- GitHub Check: tests (05)
- GitHub Check: tests (04)
- GitHub Check: tests (02)
- GitHub Check: tests (01)
- GitHub Check: e2e-test (rotate)
- GitHub Check: tests (00)
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (6)
internal/statesync/syncer.go (6)
33-34
: LGTM: Timeout constant added for chunk requests.The new constant
chunkRequestSendTimeout
with a value of 5 seconds is appropriate for network operations.
377-379
: LGTM: Initial height validation added.Good addition of validation to ensure initial height is valid.
381-385
: LGTM: Genesis block retrieval.Correctly using
MustConvertUint64
for height conversion as per previous learnings.
605-632
: LGTM: Well-implemented snapshot finalization.The new
finalizeSnapshot
method:
- Has proper null checks
- Includes informative logging
- Handles proto conversion errors
- Uses clear error messages
530-537
: LGTM: Improved debug logging.Enhanced logging in the fetchChunks method improves observability.
599-602
: LGTM: Added timeout for chunk requests.Good addition of context timeout for chunk requests using the new constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/statesync/chunks.go (1)
405-418
: Enhance error handling documentation and messages.Consider improving the error handling:
- Document possible error types (
errNilSnapshot
and custom errors) in the method comments.- Make the error message more specific about why the chunk wasn't found (e.g., "chunk not in queue" vs "chunk not found").
// getItem fetches chunk item from the items map. If the item is not found, it returns an error. +// Returns: +// - errNilSnapshot if the queue is closed +// - error if the chunk is not in the queue // The caller must hold the mutex lock. func (q *chunkQueue) getItem(chunkID bytes.HexBytes) (*chunkItem, error) { if q.snapshot == nil { return nil, errNilSnapshot } chunkIDKey := chunkID.String() item, ok := q.items[chunkIDKey] if !ok { - return nil, fmt.Errorf("chunk %x not found", chunkID) + return nil, fmt.Errorf("chunk %x not in queue", chunkID) } return item, nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/statesync/chunks.go
(7 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/chunks.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/chunks.go:304-306
Timestamp: 2025-02-05T10:12:23.371Z
Learning: In the state sync chunk queue's timeout path, the locking in `q.Pending()` is sufficient as it's the only lock in this execution path, making the error reporting thread-safe.
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: test_apps
- GitHub Check: test_abci_cli
- GitHub Check: build
- GitHub Check: tests (05)
- GitHub Check: tests (04)
- GitHub Check: tests (02)
- GitHub Check: tests (01)
- GitHub Check: e2e-test (rotate)
- GitHub Check: e2e-test (dashcore)
- GitHub Check: golangci-lint
- GitHub Check: Super linter
🔇 Additional comments (3)
internal/statesync/chunks.go (3)
154-206
: LGTM! Well-structured changes to improve robustness.The changes improve the method by:
- Adding better error handling with detailed messages
- Using proper mutex handling to avoid deadlocks
- Supporting empty chunks
- Unlocking before channel operations to prevent deadlocks
304-330
: LGTM! Thread-safe timeout handling with improved error reporting.The changes improve error reporting by including the count of pending chunks in the timeout error message. The locking is correctly handled through
q.Pending()
.
332-345
: LGTM! Efficient implementation with proper locking.The method is well-implemented with:
- Proper mutex handling
- Efficient slice pre-allocation
- Clear status checks for pending chunks
@coderabbitai review all |
@lklimek: I'll perform a full review of all changes in the PR. ✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (26)
libs/math/safemath.go (1)
262-289
: Dynamic max function.
Max[T Integer]()
provides a wide coverage across all integer types. The panic in thedefault
case clearly signals unsupported types, which is good.Consider adding a short doc comment that clarifies the function’s usage and mentions the panic for unsupported types.
internal/statesync/syncer.go (4)
33-33
: Consider Making the Timeout Configurable
DefiningchunkRequestSendTimeout
as a constant is acceptable, but allowing it to be overridden via configuration can make it more flexible for different environments.-const chunkRequestSendTimeout = 5 * time.Second +// Optionally make this configurable in the config +const chunkRequestSendTimeout = 5 * time.Second
176-178
: Validate Boundary Condition
Whenretries > 0 && snapshot == nil && iters > retries
, the code returnserrNoSnapshots
. Confirm thatiters > retries
(rather than>=
) is the intended boundary check.
599-603
: Per-Send Context Timeout
Wrappings.chunkCh.Send
with acontext.WithTimeout
helps prevent indefinite blocking when the channel is congested.
605-632
: Suggest Additional Validations
finalizeSnapshot
correctly checks for nil blocks and callsFinalizeSnapshot
on the ABCI app. For completeness, consider verifying that thesnapshotBlock
height matchessnapshot.Height
and thatgenesisBlock
is indeed the correct initial block for the chain.internal/p2p/router.go (1)
360-365
: Per-Message Enqueue Timeout
Wrapping the enqueue operation incontext.WithTimeout(ctx, time.Minute)
prevents the router from indefinitely blocking. Consider making this duration configurable.internal/statesync/reactor.go (5)
72-73
: Consider makingminPeers
configurable.Currently hard-coded to 2, the TODO hints this may need to adapt to different network scenarios. Configurability could help operators tune the state sync process.
141-148
: Expand documentation on the new interface.
ConsensusStateProvider
looks straightforward. Including docstrings on possible side effects and expected error scenarios can ensure clarity for implementers.
170-170
: Re-check parameter positioning.Adding
csState
at the end is logical for optional parameters. If this is mandatory, consider reordering so that optional arguments remain last.
251-251
: Evaluate log level for initialization message.Currently using
Info
for an internal initialization log. Consider downgrading toDebug
in high-traffic or production systems to reduce log noise.
337-337
: Repeated peer checks.Calling
waitForEnoughPeers
again ensures consistency but might be refactored for better maintenance if repeated often.types/validator_test.go (2)
119-119
: Fix typo in test function name.The test function name has a typo: "TestValdiatorSetHash" should be "TestValidatorSetHash".
-func TestValdiatorSetHash(t *testing.T) { +func TestValidatorSetHash(t *testing.T) {
128-128
: Remove commented out code.The commented out
require.NoError(t, err)
line appears to be unnecessary.- // require.NoError(t, err)
libs/math/safemath_test.go (1)
88-191
: Consider using table-driven subtests for better test organization.While the test cases are comprehensive, using
t.Run()
with table-driven subtests would improve test output and failure isolation.Here's how you could refactor the test:
func TestSafeConvert(t *testing.T) { testCases := []struct { + name string from interface{} want interface{} err bool }{ - {int(0), int64(0), false}, + { + name: "int zero to int64", + from: int(0), + want: int64(0), + err: false, + }, // ... other test cases ... } for i, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { var result interface{} var err error // ... rest of the test logic ... + }) } }abci/types/application.go (1)
102-105
: Consider enhancing the documentation comment.The current comment could be more descriptive about when this method is called and what it means for the application.
-// FinalizeSnapshot provides a no-op implementation for the StateSyncer interface +// FinalizeSnapshot is called after successful state sync completion to provide the application +// with the light block at the sync height. This allows the application to verify the state +// matches the expected hash and perform any necessary finalization steps. The default +// implementation is a no-op. func (BaseApplication) FinalizeSnapshot(_ context.Context, _req *RequestFinalizeSnapshot) (*ResponseFinalizeSnapshot, error) {config/toml.go (1)
501-505
: LGTM! Consider enhancing the retry documentation.The removal of trust-based parameters and addition of retry mechanism looks good. Consider adding a note about the impact of retries on state sync duration.
# Number of times to retry state sync. When retries are exhausted, the node will # fall back to the regular block sync. Set to 0 to disable retries. Default is 3. # Note that in pessimistic case, it will take at least (discovery-time * retries) before -# falling back to block sync. +# falling back to block sync. For example, with discovery-time=30s and retries=3, +# it may take up to 90 seconds before falling back to block sync. retries = {{ .StateSync.Retries }}abci/example/kvstore/kvstore.go (1)
583-606
: LGTM! Consider adding metrics for snapshot finalization.The implementation correctly verifies the snapshot state before finalization. Consider adding metrics to track successful/failed snapshot finalizations.
func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) { app.mu.Lock() defer app.mu.Unlock() + + // Track snapshot finalization metrics + defer func(start time.Time) { + app.metrics.StateSync.FinalizeSnapshotTime.Observe(time.Since(start).Seconds()) + }(time.Now()) // we only run some verifications here if app.LastCommittedState.GetHeight() != req.SnapshotBlock.SignedHeader.Header.Height {internal/statesync/reactor_test.go (1)
444-446
: Consider adding test cases for consensus state provider.The consensus state provider is added but only used for height validation. Consider adding test cases for error scenarios.
// Example test case: func TestReactor_Sync_ConsensusStateError(t *testing.T) { cp := mocks.NewConsensusStateProvider(t) cp.On("GetCurrentHeight").Return(int64(0)).Once() rts := setup(ctx, t, nil, nil, cp, 100) // Test error handling... }internal/mempool/reactor.go (1)
97-97
: Adjusted Logging Level for Peer Updates
Switching from a higher log level toTrace
for logging peer updates provides finer-grained output. Ensure that your deployment’s logging configuration filters trace logs in production to avoid log flooding.internal/blocksync/reactor.go (1)
162-162
: Changed Log Level to Trace for Peer Updates
The change to useTrace
instead of a more prominent log level (e.g., Debug) for recording peer update events enhances traceability while keeping production logs cleaner. Confirm that trace-level logging is enabled only when detailed diagnostics are needed.internal/evidence/reactor.go (1)
204-204
: Logging Level Adjusted for Peer Updates
Updating the log statement toTrace
for peer updates aligns with the overall logging refinement across the codebase. Verify that this additional verbosity is controlled by the log configuration in production environments.proto/tendermint/abci/types.proto (1)
913-914
: New FinalizeSnapshot RPC Method:
The service definition now includes the FinalizeSnapshot RPC. This addition correctly routes FinalizeSnapshot requests to the ABCI application. Please ensure that its usage is thoroughly covered by unit and end-to-end tests.spec/abci++/api.md (4)
430-447
: API Documentation – RequestFinalizeSnapshot Section:
The new section documents when and how RequestFinalizeSnapshot is used—including the fieldssnapshot_block
andgenesis_block
—and explains that the application should validate the restored state. It might be beneficial to add an explicit note that any validation failure should result in an error returned directly (thus causing a restart), to reinforce the operational expectations.🧰 Tools
🪛 LanguageTool
[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...(ENGLISH_WORD_REPEAT_RULE)
🪛 markdownlint-cli2 (0.17.2)
431-431: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below(MD022, blanks-around-headings)
443-443: Link fragments should be valid
null(MD051, link-fragments)
444-444: Link fragments should be valid
null(MD051, link-fragments)
949-956
: API Documentation – ResponseFinalizeSnapshot Section:
The documentation for ResponseFinalizeSnapshot is concise, reflecting its empty structure as required. For clarity, consider adding a brief note that any snapshot finalization error must be communicated as an RPC error rather than through this empty message.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
950-950: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below(MD022, blanks-around-headings)
431-431
: Documentation Formatting:
Static analysis highlights that headings should be surrounded by blank lines (e.g. before “### RequestFinalizeSnapshot”). Please ensure to update the markdown formatting accordingly.🧰 Tools
🪛 LanguageTool
[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...(ENGLISH_WORD_REPEAT_RULE)
🪛 markdownlint-cli2 (0.17.2)
431-431: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below(MD022, blanks-around-headings)
950-950
: Documentation Formatting:
Similarly, ensure that the “### ResponseFinalizeSnapshot” heading is surrounded by blank lines to meet markdownlint guidelines.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
950-950: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below(MD022, blanks-around-headings)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
abci/types/types.pb.go
is excluded by!**/*.pb.go
go.sum
is excluded by!**/*.sum
📒 Files selected for processing (43)
.mockery.yaml
(1 hunks)abci/client/grpc_client.go
(1 hunks)abci/client/mocks/client.go
(1 hunks)abci/client/routed_client.go
(1 hunks)abci/client/socket_client.go
(2 hunks)abci/example/kvstore/kvstore.go
(1 hunks)abci/types/application.go
(2 hunks)abci/types/messages.go
(1 hunks)abci/types/mocks/application.go
(1 hunks)config/config.go
(3 hunks)config/toml.go
(1 hunks)go.mod
(3 hunks)internal/blocksync/reactor.go
(1 hunks)internal/consensus/state.go
(1 hunks)internal/consensus/state_data.go
(1 hunks)internal/evidence/reactor.go
(1 hunks)internal/libs/sync/mutexguard.go
(1 hunks)internal/mempool/reactor.go
(1 hunks)internal/p2p/router.go
(1 hunks)internal/p2p/transport_mconn.go
(1 hunks)internal/proxy/client.go
(1 hunks)internal/statesync/chunks.go
(7 hunks)internal/statesync/chunks_test.go
(4 hunks)internal/statesync/mocks/consensusstateprovider.go
(1 hunks)internal/statesync/mocks/stateprovider.go
(1 hunks)internal/statesync/peer.go
(2 hunks)internal/statesync/reactor.go
(19 hunks)internal/statesync/reactor_test.go
(16 hunks)internal/statesync/stateprovider.go
(5 hunks)internal/statesync/syncer.go
(12 hunks)internal/statesync/syncer_test.go
(12 hunks)libs/math/safemath.go
(3 hunks)libs/math/safemath_test.go
(1 hunks)proto/tendermint/abci/types.proto
(5 hunks)spec/abci++/api.md
(7 hunks)test/e2e/networks/rotate.toml
(2 hunks)test/e2e/pkg/mockcoreserver/core_server.go
(1 hunks)test/e2e/pkg/mockcoreserver/methods.go
(1 hunks)test/e2e/runner/setup.go
(0 hunks)test/e2e/runner/start.go
(1 hunks)types/quorum_sign_data.go
(1 hunks)types/validator_test.go
(2 hunks)version/version.go
(1 hunks)
💤 Files with no reviewable changes (1)
- test/e2e/runner/setup.go
🧰 Additional context used
📓 Learnings (5)
libs/math/safemath_test.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
proto/tendermint/abci/types.proto (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: proto/tendermint/abci/types.proto:645-648
Timestamp: 2025-01-20T11:38:49.099Z
Learning: In ABCI protocol, FinalizeSnapshot call should return error directly for validation failures rather than using error fields in ResponseFinalizeSnapshot message, as these failures are considered fatal consensus failures that should cause Tenderdash to restart.
internal/statesync/chunks.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/chunks.go:304-306
Timestamp: 2025-02-05T10:12:23.371Z
Learning: In the state sync chunk queue's timeout path, the locking in `q.Pending()` is sufficient as it's the only lock in this execution path, making the error reporting thread-safe.
libs/math/safemath.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
🪛 LanguageTool
spec/abci++/api.md
[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...
(ENGLISH_WORD_REPEAT_RULE)
🪛 markdownlint-cli2 (0.17.2)
spec/abci++/api.md
431-431: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
443-443: Link fragments should be valid
null
(MD051, link-fragments)
444-444: Link fragments should be valid
null
(MD051, link-fragments)
950-950: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🔇 Additional comments (124)
internal/statesync/chunks_test.go (4)
4-4
: LGTM! Good improvement in error handling.Adding the
errors
package enables the use oferrors.Is()
for more robust error type checking.
104-107
: LGTM! Enhanced test failure messages.Including the test case index in error messages improves debugging by making it clear which test case failed.
248-249
: LGTM! Improved error handling practices.The changes from direct error comparison to
errors.Is()
andrequire.ErrorIs()
follow Go's best practices for error handling, properly handling wrapped errors.Also applies to: 288-288
1-368
: Verify test coverage for state sync functionality.While this file has good test coverage for chunk queue functionality, please ensure that the new
FinalizeSnapshot
functionality mentioned in the PR objectives is covered in other test files.Let's check for test coverage of the new functionality:
libs/math/safemath.go (16)
11-11
: New error constant is fine.
DeclaringErrOverflowUint64
helps ensure more precise error reporting for operations involving uint64 overflows.
15-15
: Broad overflow error is acceptable.
ErrOverflow
serves as a general catch-all and can be reused in multiple scenarios, providing good flexibility.
99-102
: Delegate safely to the generic converter.
UsingSafeConvert[T, uint64]
withinSafeConvertUint64()
centralizes logic in one place, reducing duplication.
104-107
: Consistent approach.
SafeConvertInt64()
likewise leverages the generic helper, maintaining a single overflow-handling pathway.
109-112
: Concise usage.
Similarly,SafeConvertInt16()
relies onSafeConvert
, which is a clean and DRY practice.
114-117
: No issues found.
SafeConvertUint16()
similarly benefits from the generic approach for overflow checks.
123-161
: Robust generic conversion logic.
SafeConvert[T, U]
provides a well-encapsulated overflow-check mechanism for various integer types. The special cases for 64-bit conversions and fallback checks withMax[U]()
andMin[U]()
are clearly outlined. Be aware of the small performance overhead from runtime type checks, but otherwise this is solid.
163-170
: Convenient must-convert method.
MustConvert
is useful where callers want to avoid repetitive error handling. Panicking is acceptable here, as the name signals the behavior clearly.
172-174
: Helper to unify usage.
MustConvertUint64
is a straightforward alias to reduce repetitive type parameters. No concerns here.
176-178
: Same pattern, no issues.
MustConvertInt64
also aligns with the same approach of reusing the generic method.
180-182
: Completes the coverage.
MustConvertUint16
adds completeness for smaller integer conversions.
184-186
: Likewise consistent.
MustConvertInt16
continues the pattern without introducing new risks.
188-190
: No concerns raised.
MustConvertUint8
is consistent with the rest of the must-convert suite.
192-194
: Uniform approach.
MustConvertUint
similarly extends the coverage touint
.
196-198
: Completes integer conversions.
MustConvertInt
follows the same safe pattern and is well-named.
291-309
: Dynamic min function with sound default for unsigned.
Min[T Integer]()
addresses the entire integer range, returning 0 for unsigned types. Good job indicating the fallback with a panic for unsupported types.internal/statesync/chunks.go (17)
4-5
: Appropriate imports.
Imports forsha256
andhex
are used for generating a unique chunk filename. No issues here.
14-14
: Namespace alias is fine.
Import aliastmsync
frominternal/libs/sync
to avoid collisions is acceptable for clarity.
70-84
: Filename generation logic is sound.
HashingchunkID
to generate a deterministic file name ensures uniqueness and avoids collision among chunks. The doc comments are clear about concurrency.
156-157
: Guard against nil chunk references.
Checking forchunk == nil
with an immediate error prevents panics or undefined behavior.
160-165
: Ensure not passing a nil byte slice.
This snippet enforces[]byte{}
instead ofnil
, which helps avoid write errors later.
166-167
: Lock guard usage is good.
CallingLockGuard(&q.mtx)
is a neat encapsulation that automatically unlocks when the function returns. Helps reduce risk of forgetting unlock calls.
169-172
: Graceful error handling.
q.getItem(chunk.ID)
surfaces an explicit error message if the chunk item is missing.
179-182
: Chunk validation performed prior to writing.
Ensures the chunk belongs to the correct snapshot and height. Good step to reduce corruption.
184-189
: Obtain filename safely.
item.Filename(q.dir)
sets up the chunk’s file path. Logging the error with chunk ID is helpful for debugging.
190-190
: File writing is done next.
item.write(data)
is executed after ensuring the path is valid, which is the logical sequence.
197-199
: Unlocking before applying.
Releasing the mutex prior to sending onapplyCh
avoids potential deadlocks ifapplyCh
is blocked. This is a sensible concurrency practice.
202-205
: Signal waiters.
Locking again to callitem.closeWaitChs(true)
ensures threadsafe channel closure. This is handled carefully.
327-330
: Timeout error with count of pending chunks.
Providinglen(q.Pending())
in the error clarifies how many chunks remain. This is good for debugging.
333-346
: Pending chunk retrieval.
ThePending()
method aggregates chunk IDs that remain unreceived. Straightforward approach, no concurrency concerns.
356-363
: Reusable helper for chunk re-queuing.
retry()
resets status toinitStatus
and re-appends the chunk torequestQueue
, which is consistent with how chunks are originally enqueued.
406-419
: Get item validation.
getItem()
ensures the queue is open (snapshot != nil
) and the item is present. Returns a descriptive error if not found, which is a well-structured approach.
421-423
: Validation method reference.
validateChunk()
is a good single place to confirm chunk alignment with the snapshot. No concurrency or logic issues spotted.internal/statesync/stateprovider.go (5)
43-44
: New LightBlock method in the interface.
ExposingLightBlock
retrieval broadens the interface's utility and aligns well with other block-specific methods.
126-131
: RPC-based LightBlock retrieval.
LightBlock()
instateProviderRPC
leverages the existingverifyLightBlockAtHeight
logic. This is consistent with the other methods and ensures validated block data.
223-224
: Minimum peers check.
Ensuring at leastminPeers
providers inNewP2PStateProvider
helps maintain a safe threshold for light client security.
227-228
: Light client creation for P2P.
Usinglight.NewClient(..., dashCoreClient, ...)
parallels the RPC version. Properly accepts multiple providers for redundancy.
271-276
: LightBlock in P2P.
Similar lock usage and verification approach asstateProviderRPC
. The consistency helps ensure uniform data retrieval no matter the strategy.internal/statesync/syncer.go (19)
19-19
: No Issues Found
Importingtmmath
looks fine and is consistent with usage elsewhere.
96-96
: Good Log Key Renaming
Using"chunkID", chunk.ID
is more consistent with other log fields.
127-128
: Reduced Logging Noise
Switching fromInfo
toDebug
for snapshots not added helps avoid clutter in production logs.
156-156
: Newretries
Parameter
Introducing aretries int
parameter forSyncAny
increases control over how many times the procedure will attempt to find snapshots.
190-193
: Re-request Snapshots
Re-requesting snapshots upon discovery failure helps ensure a robust sync process. Adding logs or metrics may further assist in debugging repeated failures.
220-220
: Potential Overflow Check
MustConvertInt64(snapshot.Height)
can panic ifsnapshot.Height
exceedsmath.MaxInt64
. Verify that snapshot heights are guaranteed to remain withinint64
range.
347-347
: Improved Error Checking
Usingerrors.Is
to detectlight.ErrNoWitnesses
is more robust and recommended over direct comparisons.
355-356
: Transition to Light Block
Retrieving a light block instead of a commit aligns with the new approach for final verification.
364-364
: Clearer Error Message
Returning a more descriptive error regarding missing witnesses clarifies the failure cause.
366-366
: Straightforward Logging
Logging the failure to verify the light block before discarding the snapshot is consistent with the approach used for other errors.
377-379
: Sanity Check onInitialHeight
EnsuringInitialHeight >= 1
prevents invalid genesis states and fails fast with a clear error.
381-385
: Consistent with Learnings onMustConvertUint64
Convertingstate.InitialHeight
touint64
here is valid, since we have a prior check ensuring it is positive.
387-390
: Validate Snapshot Finalization
Immediately returning an error iffinalizeSnapshot
fails prevents partially finalized states.
403-403
: Return the LightBlock Commit
Revising the return type to use the block’s commit is consistent with the shift towards light blocks.
530-536
: Helpful Debug Messages
Logging when the fetch chunk queue is empty or times out provides clear traceability when diagnosing synchronization stalls.
542-542
: Added Error Context in Logs
Including"err", err
in the debug statement clarifies why the queue is waiting again.
548-548
: Consistent Log Key
Using"chunkID", ID
matches the naming convention seen in other logging statements.
551-553
: Retry Mechanism on Failure
Re-enqueuing chunks instead of immediately aborting ensures resilience if a chunk request fails at first attempt.
560-561
: Chunk Timeout Retry
Logging a timeout for chunk reception and retrying is a good approach for robust chunk acquisition.internal/statesync/syncer_test.go (7)
86-86
: ExplicitInitialHeight
in Tests
SettingInitialHeight: 1
in the test ensures state init logic is properly checked.
103-115
: LightBlock Construction
Defining both aCommit
and aSignedHeader
in the test suite simulates real block data, enabling more realistic test coverage.
131-132
: MockingLightBlock
Retrieval
Stubbing outOn("LightBlock", ...)
ensures sync logic that depends on the final block data is properly verified.
268-269
: Testing FinalizeSnapshot
Adding a mock expectation forFinalizeSnapshot
validates the snapshot finalization flow and ensures the method is invoked as intended.
280-280
: Zero-Retries Case
CallingSyncAny
with(ctx, 0, 0, ...)
tests the flow without extra attempts, ensuring coverage of the one-shot scenario.
297-298
: Robust Error Path Testing
These lines confirm correct error behavior (errNoSnapshots
,errAbort
, or custom errors) under different snapshot rejection scenarios. This broadens test coverage for edge conditions.Also applies to: 323-324, 373-374, 416-417, 470-471, 498-499
627-628
: Chunk Queue Initialization
Explicitly creating and enqueuing chunks checks that snapshot chunk logic is triggered within tests.internal/p2p/router.go (2)
368-370
: Conditional Logging for ChunkChannel
Debug-logging sent chunk messages increases transparency in the gossip layer while avoiding verbose logs for other channels.
376-378
: Timeout Handling
Dropping a message after the enqueue context times out is a practical safeguard against resource exhaustion.internal/statesync/reactor.go (12)
24-24
: Approved import usage.The new import for
tmmath
is well-placed, providing crucial numeric-safe conversions for subsequent logic.
138-139
: Validate concurrency handling oncsState
.Introducing
csState
can be helpful but be cautious if it's accessed by multiple goroutines. Confirm calls are protected by locks to avoid race conditions.
254-254
: Peer availability pre-check.Verifying peer count before proceeding helps ensure state sync is viable. Good defensive approach.
264-265
: P2P state provider initialization.Removal of trust parameters aligns with the updated approach. The method signature is coherent.
273-273
: RPC state provider update.Similarly removing trust-based parameters. Consistent with the revised design.
292-306
: Fallback logic when no snapshots.Gracefully falling back to block sync is a practical solution if no snapshots are found. Verify that all edge cases (e.g., corrupted snapshots) are covered by tests.
355-355
: ConfirmSyncAny
test coverage.Linking
SyncAny
to configuration values requires thorough testing (especially variedDiscoveryTime
andRetries
) to ensure robust error-handling.
722-722
: Explicit handling of unknown message types.Returning errors for unrecognized channel messages is essential to prevent silent failures or unexpected behaviors.
Also applies to: 803-803, 864-864, 912-912
997-997
: Trace-level logging for peer updates.Using
Trace
helps keep the logs less cluttered in production. Consider switching toDebug
if these events become critical to track.Also applies to: 1006-1006
1064-1069
: Guard againstnil
consensus state.If
csState
is absent, returning an empty snapshot list avoids potential nil pointer errors. This is a solid defensive check.
1095-1101
: Filtering snapshots close to current height.Skipping snapshots that are too recent ensures that a finalized commit is available. Correct approach for stable snapshot data.
1117-1117
: Beware of possible overflow inMustConvertInt64
.
MustConvertInt64
will panic on extreme inputs. Consider a more graceful failure if heights exceedint64
limits.version/version.go (1)
14-14
: New ABCI semantic version.Updating
ABCISemVer
to 1.3.0 reflects added functionalities. Ensure this is consistently documented throughout release notes.internal/libs/sync/mutexguard.go (5)
1-5
: Definition ofUnlockFn
.Returning a boolean status is a helpful mechanism to detect redundant unlock attempts.
7-12
: MinimalMtx
interface.This design allows interchangeability with standard or custom mutex implementations.
14-20
:RMtx
interface for read locks.Similarly provides flexibility for read locks and potential custom read-write lock implementations.
22-37
:LockGuard
function.Encapsulating lock/unlock behavior in this higher-level construct simplifies usage and promotes safer concurrency patterns.
39-54
:RLockGuard
function.Mirrors the read-lock side effectively. Both guard functions improve usability and reduce the risk of forgetting unlock calls.
test/e2e/runner/start.go (1)
91-91
: LGTM! Code simplification improves readability.The changes effectively simplify the waitForHeight logic by removing unnecessary intermediate variables while maintaining the same functionality.
test/e2e/pkg/mockcoreserver/methods.go (1)
68-68
: LGTM! Proper handling of QuorumHash parameter.The change correctly includes QuorumHash in the unmarshalCmd parameters, ensuring proper verification functionality.
internal/statesync/mocks/consensusstateprovider.go (1)
1-127
: LGTM! Well-structured auto-generated mock.The auto-generated mock follows standard patterns and includes proper error handling, type assertions, and documentation.
internal/statesync/peer.go (1)
99-99
: LGTM! Appropriate logging level adjustment.The change from Info to Trace level is appropriate for these peer update logs, as they are frequent operational events that don't need high visibility.
Also applies to: 109-109
abci/types/messages.go (1)
90-94
: LGTM! Consistent implementation.The new
ToRequestFinalizeSnapshot
function follows the established pattern for request conversion functions and integrates well with the state sync feature.abci/types/application.go (1)
19-20
: LGTM! Well-integrated interface addition.The new
FinalizeSnapshot
method is a logical addition to theStateSyncer
interface for completing the state sync process.internal/statesync/mocks/stateprovider.go (2)
148-176
: Implementation follows mock patterns correctly.The LightBlock method implementation follows the established mock patterns with proper error handling and type checking.
178-205
: Helper methods are properly implemented.The mock helper methods (Run, Return, RunAndReturn) are correctly implemented following the established patterns.
types/quorum_sign_data.go (1)
164-166
: Verify the QuorumHash size constant change.The validation now uses
crypto.QuorumHashSize
instead ofcrypto.DefaultHashSize
. This change could affect existing code that relies on the previous validation.Run the following script to check for any other occurrences of QuorumHash validation:
✅ Verification successful
🏁 Script executed:
#!/bin/bash # Search for QuorumHash size validations rg -A 2 "QuorumHash.*len|len.*QuorumHash" # Search for DefaultHashSize usage rg -A 2 "DefaultHashSize"Length of output: 19354
QuorumHash validation change verified.
The updated validation in
types/quorum_sign_data.go
now correctly usescrypto.QuorumHashSize
. This change is intentional and clarifies that the quorum hash is being validated with a dedicated constant. Note that incrypto/crypto.go
,QuorumHashSize
is defined as equal tocrypto.DefaultHashSize
(32), so the change does not alter the effective size check but improves semantic clarity for quorum-specific code.
types/quorum_sign_data.go
now validates usingcrypto.QuorumHashSize
.crypto/crypto.go
shows thatQuorumHashSize
is assigned the same value asDefaultHashSize
.- Other parts of the code continue to use
DefaultHashSize
where appropriate, so the change is localized and intentional.internal/proxy/client.go (1)
215-218
: Implementation follows proxy patterns correctly.The FinalizeSnapshot method is properly implemented with metrics timing and follows the established proxy patterns.
abci/client/grpc_client.go (1)
258-260
: Implementation follows gRPC patterns correctly.The FinalizeSnapshot method is properly implemented with WaitForReady and follows the established gRPC patterns.
internal/consensus/state.go (1)
310-312
: LGTM! Clean implementation of height accessor.The
GetCurrentHeight
method provides a clean and thread-safe way to access the current height from the state data store.abci/client/routed_client.go (1)
354-357
: LGTM! Consistent implementation of ABCI method.The
FinalizeSnapshot
implementation follows the established pattern for ABCI methods in the routed client, properly handling delegation and response type casting.abci/client/socket_client.go (2)
360-366
: LGTM! Consistent implementation of ABCI method.The
FinalizeSnapshot
implementation follows the established pattern for ABCI methods in the socket client, properly handling request/response flow and error cases.
440-441
: LGTM! Proper request/response type matching.The addition of FinalizeSnapshot case in
resMatchesReq
ensures proper type matching for the new message type.internal/p2p/transport_mconn.go (1)
187-187
: Verify impact of default port change.The default port has been changed from 26657 to 26656. This change could affect network connectivity for existing deployments.
Please confirm:
- Is this change intentional and documented?
- Has this been communicated to node operators?
- Are there any migration steps needed?
abci/types/mocks/application.go (1)
261-318
: LGTM! Auto-generated mock implementation.The new
FinalizeSnapshot
mock method and related types follow the established pattern for mock implementations.internal/statesync/reactor_test.go (2)
241-242
: LGTM! Mock expectation for FinalizeSnapshot.The mock expectation correctly verifies that FinalizeSnapshot is called exactly once.
772-786
: LGTM! Helper function for retry logic.The
retryUntil
helper function provides a clean way to handle retries in tests.abci/client/mocks/client.go (1)
365-422
: LGTM! The FinalizeSnapshot mock implementation is well-structured.The implementation follows the established pattern in the codebase, with proper error handling, type assertions, and the standard mock method structure.
config/config.go (2)
1006-1011
: Well-documented retry configuration.The documentation clearly explains the retry behavior and its implications, particularly noting the pessimistic case timing with discovery-time.
1065-1067
: LGTM! Proper validation for the new retries field.The validation ensures retries cannot be negative, maintaining configuration integrity.
.mockery.yaml (1)
56-56
: LGTM! ConsensusStateProvider interface added correctly.The addition follows the file's structure and maintains proper organization.
test/e2e/networks/rotate.toml (2)
155-155
: LGTM! Enabled P2P state sync for validator06.The configuration aligns with the new state sync implementation.
195-195
: LGTM! Switched full01 to P2P state sync.The change maintains consistency with the new state sync approach across different node types.
test/e2e/pkg/mockcoreserver/core_server.go (1)
149-150
: Simplified Result Assignment inQuorumVerify
The inline initialization of thebtcjson.QuorumVerifyResult
with the computedsignatureVerified
is clear and succinct. Please verify that this simplification preserves all intended behavior from the previous implementation.internal/consensus/state_data.go (1)
122-122
: Minor Comment Correction
The typographical correction in the comment (changing "nad" to "and") improves readability without affecting functionality.go.mod (3)
12-13
: Dependency Update – dashd-go and btcec/v2:
The versions for “github.com/dashpay/dashd-go” (updated to v0.26.1) and its indirect dependency “github.com/dashpay/dashd-go/btcec/v2” (updated to v2.2.0) have been bumped. Please ensure that these updates are fully compatible with the new FinalizeSnapshot functionality (and any other state sync-related features) across the codebase.
84-84
: Stable Version Upgrade – btclog:
The upgrade from a pseudo-version to “github.com/btcsuite/btclog v1.0.0” is a positive change for stability. Verify that no API conflicts occur due to this change.
112-112
: Dependency Bump – btcutil:
Updating “github.com/dashpay/dashd-go/btcutil” to v1.3.0 aligns with the other dependency updates. Please double-check that any API changes in this library are handled appropriately in the FinalizeSnapshot and related implementations.proto/tendermint/abci/types.proto (4)
35-35
: FinalizeSnapshot Field Integration:
A new fieldfinalize_snapshot
(tag 20) has been added to the oneof block in the Request message. This integration is clear; just verify that tag 20 does not conflict with any reserved or previously used tags.
193-198
: New RequestFinalizeSnapshot Message:
The new message correctly defines two fields—snapshot_block
andgenesis_block
—using the LightBlock type. This design meets the PR objective to allow state validation upon snapshot finalization. Please ensure that the application’s validation logic uses these fields appropriately.
531-532
: FinalizeSnapshot Response in Oneof:
The addition of thefinalize_snapshot
field (with tag 17) in the Response oneof is consistent with our design: the response remains empty so that any errors are returned as RPC errors rather than via response fields.
647-650
: New ResponseFinalizeSnapshot Message:
The newly introduced response message is intentionally empty, matching our consensus that in case of validation issues the ABCI application should return an error directly. This is aligned with previous design discussions and learnings.spec/abci++/api.md (2)
21-22
: TOC Update – RequestFinalizeSnapshot Entry:
The Table of Contents has been updated appropriately to include a link to the new RequestFinalizeSnapshot section. This helps users quickly locate the new functionality in the documentation.
39-40
: TOC Update – ResponseFinalizeSnapshot Entry:
Similarly, adding the ResponseFinalizeSnapshot entry to the TOC provides a complete view of the new message types added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
libs/math/safemath.go (2)
99-117
: Enhance function documentation.While the functions are well-implemented, their documentation could be more descriptive. Consider adding:
- Parameter descriptions
- Return value descriptions
- Example usage
Apply this diff to improve the documentation:
-// SafeConvertUint64 takes a int and checks if it overflows. +// SafeConvertUint64 safely converts a value of type T to uint64. +// +// Parameters: +// - a: The value to convert +// +// Returns: +// - uint64: The converted value +// - error: ErrOverflow if the conversion would overflow +// +// Example: +// val, err := SafeConvertUint64(int32(42))
262-311
: Consider enhancing error handling in Max/Min functions.While the implementation is solid, consider replacing panics with errors for better error handling in production code.
Apply this diff to improve error handling:
-func Max[T Integer]() uint64 { +func Max[T Integer]() (uint64, error) { var max T switch any(max).(type) { // ... existing cases ... default: - panic("unsupported type") + return 0, fmt.Errorf("unsupported type %T", max) } } -func Min[T Integer]() int64 { +func Min[T Integer]() (int64, error) { switch any(T(0)).(type) { // ... existing cases ... default: - panic("unsupported type") + return 0, fmt.Errorf("unsupported type %T", T(0)) } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
libs/math/safemath.go
(3 hunks)
🧰 Additional context used
📓 Learnings (1)
libs/math/safemath.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: test_abci_cli
- GitHub Check: test_apps
- GitHub Check: e2e-test (rotate)
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (3)
libs/math/safemath.go (3)
11-11
: LGTM! Error variables are well-defined.The new error variables follow the established pattern and provide appropriate granularity for error handling.
Also applies to: 15-15
123-161
: LGTM! Robust implementation with comprehensive edge case handling.The implementation correctly handles:
- Special cases for int64/uint64 conversions
- Platform-specific integer sizes
- Negative value conversions
- Overflow checks using Max/Min bounds
163-198
: LGTM! Well-implemented panic wrappers.The MustConvert functions provide a clean API for cases where panics are acceptable, and the error messages are now correctly implemented using a zero value of the target type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (7)
types/validator_test.go (3)
3-14
: Remove unused import.The
fmt
package is imported but not used in the code.- "fmt"
118-119
: Fix typo in function name.The function name contains a typo: "Valdiator" should be "Validator".
-func TestValdiatorSetHashVectors(t *testing.T) { +func TestValidatorSetHashVectors(t *testing.T) {
118-135
: Enhance test documentation and coverage.The test uses hardcoded test vectors without documenting their source or significance. Consider:
- Adding comments explaining the source/significance of the test vectors
- Using test table pattern for multiple test cases
- Adding comments explaining the expected hash calculation
Here's a suggested refactor:
-// TestValdiatorSetHashVectors checks if provided validator threshold pubkey and quorum hash returns expected hash -func TestValdiatorSetHashVectors(t *testing.T) { +// TestValidatorSetHashVectors verifies the ValidatorSet.Hash() computation using known test vectors. +// These test vectors were generated from [add source/reference here]. +func TestValidatorSetHashVectors(t *testing.T) { + testCases := []struct { + name string + thresholdPubKeyB64 string + quorumHashHex string + expectedHashHex string + }{ + { + name: "known good vector", + thresholdPubKeyB64: "gw5F5F5kFNnWFUc8woFOaxccUI+cd+ixaSS3RZT2HJlWpvoWM16YRn6sjYvbdtGH", + quorumHashHex: "703ee5bfc78765cc9e151d8dd84e30e196ababa83ac6cbdee31a88a46bba81b9", + expectedHashHex: "81742F95E99EAE96ABC727FE792CECB4996205DE6BFC88AFEE1F60B96BC648B2", + }, + // Add more test vectors here + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + thresholdPublicKey, err := base64.RawStdEncoding.DecodeString(tc.thresholdPubKeyB64) + require.NoError(t, err) + + quorumHash, err := hex.DecodeString(tc.quorumHashHex) + require.NoError(t, err) + + valset := ValidatorSet{ + ThresholdPublicKey: bls12381.PubKey(thresholdPublicKey), + QuorumHash: quorumHash, + } + + hash := valset.Hash() + assert.Equal(t, tc.expectedHashHex, hash.String()) + }) + }internal/libs/sync/mutexguard_test.go (1)
104-141
: Consider making sleep durations configurable.While the test effectively simulates real-world scenarios with processing delays, fixed sleep durations can make tests flaky on different systems or under load.
Consider making the sleep durations configurable or using channels/contexts for more deterministic timing:
- time.Sleep(10 * time.Millisecond) // Simulate read delay + time.Sleep(testReadDelay) // Simulate read delay- time.Sleep(10 * time.Millisecond) // Simulate write delay + time.Sleep(testWriteDelay) // Simulate write delayWhere
testReadDelay
andtestWriteDelay
are configurable durations, possibly passed as test parameters or defined as constants that can be adjusted based on the test environment.internal/statesync/stateprovider.go (1)
274-279
: Consider reducing code duplication.The LightBlock implementation is identical in both stateProviderRPC and stateProviderP2P.
Consider extracting the common implementation to a shared helper function or base struct:
+func getVerifiedLightBlock(s interface { + sync.Locker + verifyLightBlockAtHeight(ctx context.Context, height uint64, ts time.Time) (*types.LightBlock, error) +}, ctx context.Context, height uint64) (*types.LightBlock, error) { + s.Lock() + defer s.Unlock() + return s.verifyLightBlockAtHeight(ctx, height, time.Now()) +} func (s *stateProviderP2P) LightBlock(ctx context.Context, height uint64) (*types.LightBlock, error) { - s.Lock() - defer s.Unlock() - return s.verifyLightBlockAtHeight(ctx, height, time.Now()) + return getVerifiedLightBlock(s, ctx, height) }internal/statesync/syncer.go (1)
552-556
: Enhance error message with context.The error message could be more descriptive to help with debugging.
- s.logger.Error("failed to request snapshot chunk", "err", err, "chunkID", ID) + s.logger.Error("failed to request snapshot chunk from peer", + "err", err, + "chunkID", ID, + "height", snapshot.Height, + "version", snapshot.Version)internal/statesync/reactor_test.go (1)
770-784
: Consider adding backoff to retry logic.While the retry function works, it could be more efficient with exponential backoff.
func retryUntil(ctx context.Context, t *testing.T, fn func() bool, timeout time.Duration) { t.Helper() ctx, cancel := context.WithTimeout(ctx, timeout) defer cancel() + backoff := time.Millisecond for { if fn() { return } require.NoError(t, ctx.Err()) + time.Sleep(backoff) + backoff = time.Duration(float64(backoff) * 1.5) // exponential backoff + if backoff > time.Second { + backoff = time.Second // cap maximum backoff + } } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
internal/libs/sync/mutexguard_test.go
(1 hunks)internal/statesync/reactor.go
(19 hunks)internal/statesync/reactor_test.go
(16 hunks)internal/statesync/stateprovider.go
(6 hunks)internal/statesync/syncer.go
(13 hunks)types/validator_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/statesync/reactor.go
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: test_abci_cli
- GitHub Check: test_apps
- GitHub Check: tests (04)
- GitHub Check: tests (02)
- GitHub Check: tests (01)
- GitHub Check: e2e-test (rotate)
- GitHub Check: Super linter
- GitHub Check: e2e-test (dashcore)
🔇 Additional comments (8)
internal/libs/sync/mutexguard_test.go (3)
14-42
: Well-structured test for lock guard behavior!The test thoroughly verifies the LockGuard's behavior with multiple unlock attempts, including proper cleanup of deadlock detection settings.
44-65
: Excellent concurrency test design!The test effectively verifies thread safety using multiple goroutines and proper synchronization with WaitGroup.
67-102
: Comprehensive test for read/write lock behavior!The test effectively verifies RWMutex functionality with a good mix of concurrent read and write operations.
internal/statesync/stateprovider.go (2)
43-44
: Clean interface extension!The new LightBlock method follows the same pattern as existing methods in the interface.
129-134
: Clean implementation of LightBlock method!The implementation follows the same pattern as other methods, properly handling locking and reusing existing verification logic.
internal/statesync/syncer.go (2)
607-634
: Well-structured finalization logic!The finalizeSnapshot implementation includes proper null checks, error handling, and clear logging.
379-387
: Appropriate use of MustConvertUint64!The code correctly validates initial height and uses MustConvertUint64 as the height is already validated to be positive.
internal/statesync/reactor_test.go (1)
241-242
: Good test coverage for FinalizeSnapshot!The mock expectations properly verify the new finalization functionality.
Issue being fixed or feature implemented
What was done?
[statesync]
:retries
How Has This Been Tested?
Updated e2e tests, updated unit test.
Breaking Changes
[statesync]
section:trust-height
,trust-hash
,trust-period
FinalizeSnapshot
1.3.0
Checklist:
For repository code-owners and collaborators only
Summary by CodeRabbit
New Features
Configuration
Dependency & Version Updates
Documentation